[00:01.500 --> 00:05.240]  Hi, everyone. My name is Feng Xiao from Georgia Tech.
[00:05.240 --> 00:10.300]  It seems like I'm the first DEF CON safe mode main talk,
[00:10.300 --> 00:14.800]  and I really hope everyone can enjoy attending my work.
[00:14.920 --> 00:17.940]  So this is, of course, not only my work,
[00:17.940 --> 00:22.880]  but also the work of my wonderful co-authors from Georgia Tech and Texas A&M.
[00:22.880 --> 00:26.000]  Today, I'm going to talk about some interesting new vulnerabilities
[00:26.000 --> 00:28.480]  in the Node.js ecosystems.
[00:29.360 --> 00:31.820]  Before the talk formally begins,
[00:32.000 --> 00:34.840]  please let me introduce myself a little more.
[00:34.920 --> 00:38.280]  I am a CS PhD student at Georgia Tech.
[00:38.280 --> 00:41.180]  My research goal is about building automatic systems
[00:41.180 --> 00:44.200]  to detect and exploit vulnerabilities.
[00:44.300 --> 00:46.540]  We want the tools to exploit vulnerabilities
[00:46.540 --> 00:50.600]  because we want the people to know the existence
[00:50.600 --> 00:53.940]  and the consequence of their security bugs.
[00:54.120 --> 00:58.040]  And I do research in the web and application security.
[00:58.040 --> 01:01.780]  But I'm also researching security problems in other areas,
[01:01.780 --> 01:06.340]  such as software-defined networks and x86 virtualization.
[01:07.100 --> 01:10.540]  Okay, so this is the topic I'm going to cover today.
[01:10.540 --> 01:13.820]  The talk will be divided into three parts.
[01:14.280 --> 01:17.620]  First, I will introduce the technical details of the new vulnerabilities
[01:17.620 --> 01:20.480]  and discuss their exploitation.
[01:20.840 --> 01:24.740]  Then, I will talk more about the bug-finding parts,
[01:24.740 --> 01:27.360]  which is about the lessons and the insights
[01:27.360 --> 01:34.660]  from building one kind of a tool to detect and exploit HPA.
[01:34.960 --> 01:40.320]  And in the last part, I will give an impact analysis of the new risk
[01:40.320 --> 01:43.800]  and some evaluation data about the tools.
[01:45.300 --> 01:47.700]  So first of all, let's take a quick look
[01:47.700 --> 01:51.140]  at the vulnerabilities we found during our research.
[01:51.920 --> 01:56.380]  So in total, we discovered 13 zero-day vulnerabilities
[01:56.380 --> 01:58.680]  from widely used programs,
[01:58.680 --> 02:02.640]  such as the official MongoDB driver and class-validator.
[02:02.900 --> 02:06.900]  And these bugs can be exploited to launch serious attacks,
[02:06.900 --> 02:10.660]  such as leaking credential data, bypassing security checks,
[02:10.660 --> 02:12.640]  and denial of service.
[02:13.120 --> 02:17.460]  So before we touch the technical details of these vulnerabilities,
[02:18.180 --> 02:23.280]  let's have a brief background introduction about Node.js.
[02:24.260 --> 02:28.480]  Node.js is for executing JavaScript code outside of browsers.
[02:28.900 --> 02:33.140]  The picture on the right is the overall system diagram of Node.js.
[02:33.260 --> 02:36.460]  To interpret and execute JavaScript,
[02:36.800 --> 02:40.360]  Node.js implements a runtime engine based on Chrome V8
[02:40.360 --> 02:45.300]  to satisfy the needs of a server-side language.
[02:46.160 --> 02:49.480]  The runtime engine also provides a set of APIs
[02:49.480 --> 02:52.940]  to let JavaScript interact with the host environments.
[02:53.280 --> 02:56.620]  By providing such APIs,
[02:56.620 --> 02:59.340]  the JavaScript code can access the host environments
[02:59.340 --> 03:01.980]  like any other server-side languages.
[03:01.980 --> 03:05.000]  For example, it can read and write file systems
[03:05.000 --> 03:07.980]  or execute system commands.
[03:08.280 --> 03:10.560]  So Node.js is pretty powerful.
[03:12.100 --> 03:15.520]  Nowadays, many websites are deploying Node.js.
[03:15.520 --> 03:18.360]  For example, Node.js is intensively used
[03:18.360 --> 03:21.300]  in companies like PayPal and LinkedIn.
[03:21.300 --> 03:25.000]  Also, we are all using a lot of Electron apps
[03:25.000 --> 03:28.740]  such as Skype, [unclear], and Discord.
[03:28.740 --> 03:33.280]  And such Electron apps are also based on the Node.js runtime.
[03:34.540 --> 03:37.640]  We've seen so many Node.js applications.
[03:37.640 --> 03:42.580]  The web-based apps are one of the most common types of Node.js programs.
[03:42.940 --> 03:45.620]  For those web-based applications,
[03:45.960 --> 03:49.780]  serializing the communication data into object representations like JSON
[03:49.780 --> 03:53.560]  is pretty common.
[03:53.780 --> 03:56.100]  And this feature is convenient.
[03:56.660 --> 03:59.640]  For example, Node.js can use this feature
[03:59.640 --> 04:03.660]  to send and receive very complex data structures.
[04:03.760 --> 04:07.280]  From the monthly download statistic picture on the right,
[04:07.280 --> 04:11.020]  we can have an idea of how the object sharing is being supported
[04:11.020 --> 04:14.060]  and used by the Node.js ecosystems.
[04:15.740 --> 04:20.200]  The diagram demonstrates how the object sharing is being used
[04:20.200 --> 04:22.820]  in the Node.js ecosystem.
[04:22.860 --> 04:28.880]  There are two major methods of serialization of object data.
[04:28.880 --> 04:32.180]  The first is the query-string-based serialization,
[04:32.180 --> 04:34.920]  and the second is the JSON-based serialization.
[04:34.940 --> 04:36.500]  As shown in the picture,
[04:36.500 --> 04:40.940]  if the user wants to update the age information in the Node.js web application,
[04:40.940 --> 04:45.320]  he can send his data either via a standard query string in the URL
[04:45.740 --> 04:49.080]  or a JSON string in the request body.
[04:49.140 --> 04:51.120]  Upon receiving the request,
[04:51.120 --> 04:54.580]  the web application will convert the JSON data
[04:54.580 --> 04:57.580]  or the query string data into an object
[04:57.580 --> 05:02.720]  so that the object can further propagate through the program logic.
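The two serialization channels described above can be sketched as follows. This is a minimal illustration, not code from the talk: the request data and variable names are made up, and only Node.js built-ins (`URLSearchParams`, `JSON.parse`) are used.

```javascript
// Hypothetical request data: the same update sent as a query string and as JSON.
const queryString = 'name=alice&age=30';
const jsonBody = '{"name":"alice","age":30}';

// Query-string-based channel: URLSearchParams is a Node.js global.
const fromQuery = Object.fromEntries(new URLSearchParams(queryString));

// JSON-based channel.
const fromJson = JSON.parse(jsonBody);

// Both channels end in a plain object that the rest of the program consumes.
console.log(fromQuery); // query-string values arrive as strings
console.log(fromJson);  // JSON values keep their JSON types
```

Either way, the client decides which keys the resulting object contains.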
[05:03.380 --> 05:07.060]  Okay, so basically this is how the object sharing is carried out
[05:07.060 --> 05:09.280]  in Node.js ecosystems.
[05:09.280 --> 05:13.380]  Usually, if we want to evaluate the security of such a program,
[05:13.380 --> 05:17.220]  we may want to inject different payloads into the age field
[05:17.220 --> 05:21.840]  so that we can try to trigger certain vulnerabilities
[05:21.840 --> 05:25.680]  such as SQL injection or cross-site scripting.
[05:26.660 --> 05:32.360]  However, what if we choose not to test the existing data field?
[05:32.800 --> 05:35.940]  Since we can pass an object into the program,
[05:35.940 --> 05:38.620]  what will happen if we inject additional properties
[05:38.620 --> 05:42.860]  that are not expected to be received by the server program?
[05:42.860 --> 05:46.580]  In particular, if an attacker can send the properties
[05:46.580 --> 05:51.040]  that forge or override certain internal program states,
[05:51.040 --> 05:53.980]  the attacker may easily obtain dangerous abilities
[05:54.360 --> 05:58.220]  to manipulate the key program logic.
[05:59.500 --> 06:04.820]  So we are going to introduce hidden property abusing, or HPA.
[06:04.820 --> 06:08.700]  Hidden property abusing leverages the object sharing in Node.js
[06:08.700 --> 06:12.680]  to tamper with or forge critical program states.
[06:12.680 --> 06:16.540]  We call the additional properties we inject hidden properties
[06:16.540 --> 06:20.240]  because these properties are kind of like some hidden parameters
[06:20.560 --> 06:24.200]  which are valid to the endpoint user API.
[06:24.620 --> 06:29.480]  These parameters are associated with certain internal program states.
[06:29.480 --> 06:32.160]  However, nobody knows their existence
[06:32.160 --> 06:36.360]  until an attacker leverages HPA to tamper with the states.
[06:37.000 --> 06:40.480]  In this talk, we mainly focus on the server-side scenarios
[06:40.480 --> 06:44.920]  where a remote attacker wants to attack the Node.js web applications
[06:44.920 --> 06:47.240]  or some microservices.
[06:47.420 --> 06:49.600]  To exploit the vulnerabilities,
[06:49.600 --> 06:52.580]  the attacker will access the legitimate interfaces
[06:52.580 --> 06:56.980]  such as the web API endpoint to send his payloads.
[06:56.980 --> 07:00.100]  In most cases, the attacker's payloads
[07:00.100 --> 07:02.600]  should be in the form of a plain object,
[07:02.600 --> 07:07.040]  which is the simplest object representation in Node.js.
[07:08.860 --> 07:15.280]  During our research, we discovered two types of typical attack vectors of HPA.
[07:15.660 --> 07:20.400]  We call the first one app-specific attribute manipulation.
[07:20.520 --> 07:24.180]  This one is for manipulating certain internal properties
[07:24.180 --> 07:27.000]  defined by the applications themselves.
[07:27.000 --> 07:30.200]  Such internal properties are supposed to be initialized
[07:30.200 --> 07:33.180]  and managed by their internal functions.
[07:33.180 --> 07:39.140]  However, they usually represent certain internal states of the programs.
[07:39.240 --> 07:40.960]  So as shown in the picture,
[07:40.960 --> 07:44.560]  the INIT row is an internal function that is responsible
[07:44.560 --> 07:49.060]  for maintaining the access right on the user object.
[07:49.300 --> 07:51.400]  However, with HPA,
[07:51.400 --> 07:56.360]  the attacker can propagate a conflicting name property to the user object.
[07:56.360 --> 08:01.720]  And thus control the internal states of the program.
[08:01.900 --> 08:03.680]  As shown in the picture,
[08:03.680 --> 08:08.260]  the program also provides an API called update.
[08:08.260 --> 08:10.740]  This is for external usage.
[08:11.060 --> 08:16.620]  However, if a malicious user injects an additional key-value pair,
[08:16.620 --> 08:22.160]  which is access: admin in the picture, into the API,
[08:22.160 --> 08:29.660]  then the additional property will override the existing access right.
[08:30.000 --> 08:31.800]  So this payload is pretty useful
[08:31.800 --> 08:37.040]  when we want to abuse certain concrete logic in large applications,
[08:37.040 --> 08:42.760]  such as order information or user privilege management logic.
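A minimal sketch of app-specific attribute manipulation, under stated assumptions: `createUser` and `update` are hypothetical stand-ins for the internal initializer and the external API on the slide, and `Object.assign` stands in for whatever copy logic the real program uses.

```javascript
// Hypothetical internal logic: the access right is meant to be set only here.
function createUser(name) {
  const user = { name, age: 0 };
  user.access = 'user'; // internal state, never expected from clients
  return user;
}

// Hypothetical external API: copies every client-supplied property onto the user.
function update(user, input) {
  return Object.assign(user, input);
}

const user = createUser('alice');

// Expected use: only the documented field is sent.
update(user, JSON.parse('{"age": 31}'));
const accessAfterBenignUpdate = user.access;

// HPA: an extra key with a conflicting name overrides the internal state.
update(user, JSON.parse('{"age": 32, "access": "admin"}'));
const accessAfterAttack = user.access;

console.log(accessAfterBenignUpdate, accessAfterAttack);
```

One common mitigation for this pattern is to copy only an explicit allow-list of fields instead of the whole input object.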
[08:43.340 --> 08:48.000]  Also, HPA can target some unique JavaScript mechanisms,
[08:48.000 --> 08:50.240]  such as prototypes.
[08:50.240 --> 08:55.420]  We call the second attack scenario prototype inheritance hijacking.
[08:56.220 --> 09:00.700]  In JavaScript, every object has a link to a prototype object.
[09:00.860 --> 09:05.640]  When the JavaScript code wants to access a property of an object,
[09:05.640 --> 09:09.200]  the property will not only be searched on the object,
[09:09.200 --> 09:11.340]  but also the prototype of the object,
[09:11.340 --> 09:14.000]  and even the prototype of the prototype,
[09:14.000 --> 09:17.260]  until a property with a matching name is found.
[09:17.740 --> 09:19.360]  As shown in the picture,
[09:19.360 --> 09:25.140]  when the JavaScript code wants to get the constructor property from the input object,
[09:25.140 --> 09:29.160]  it will first search locally within the input object.
[09:29.160 --> 09:32.780]  Since there is no property named constructor here,
[09:32.780 --> 09:36.240]  the code will continue searching in its prototype,
[09:36.240 --> 09:40.060]  where the constructor is really located.
[09:40.760 --> 09:44.380]  So with HPA, we can hijack the inheritance chain
[09:44.380 --> 09:49.700]  and forge our own payloads as the internal properties on the chain.
[09:49.900 --> 09:51.400]  As shown in the picture,
[09:51.400 --> 09:54.300]  if we inject a property named constructor,
[09:54.300 --> 09:57.120]  the search process will be very different.
[09:57.120 --> 10:01.980]  Since there is already a property named constructor within the input object,
[10:01.980 --> 10:03.660]  the search will immediately stop
[10:05.000 --> 10:09.180]  and end up returning a user-controlled value.
[10:09.300 --> 10:11.620]  As demonstrated in the red circle,
[10:11.620 --> 10:14.580]  the value of the constructor will become a string,
[10:14.580 --> 10:18.600]  "Rick and Morty", rather than a normal JavaScript function.
[10:19.280 --> 10:22.140]  So the second attack vector is really useful
[10:22.140 --> 10:26.100]  because we found many JavaScript developers
[10:26.100 --> 10:29.640]  tend to trust the properties inherited from prototypes
[10:29.640 --> 10:34.700]  and make many security-sensitive decisions based on them.
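The lookup behavior described above can be reproduced in a few lines. This is a sketch with made-up payloads; it relies only on standard JavaScript semantics (`JSON.parse` creates an own property for a `constructor` key, which shadows the inherited one).

```javascript
// Benign input: `constructor` is not an own property, so the lookup
// walks up to Object.prototype and finds the Object function.
const benign = JSON.parse('{"email":"a@example.com"}');
const benignHasOwn = Object.prototype.hasOwnProperty.call(benign, 'constructor');
const benignValue = benign.constructor;

// Crafted input: an own property named `constructor` stops the search
// immediately and returns an attacker-chosen value.
const crafted = JSON.parse('{"email":"a@example.com","constructor":"attacker string"}');
const craftedHasOwn = Object.prototype.hasOwnProperty.call(crafted, 'constructor');
const craftedValue = crafted.constructor;

console.log(benignHasOwn, typeof benignValue);   // inherited function
console.log(craftedHasOwn, typeof craftedValue); // own string
```

Code that reads `obj.constructor` and assumes it is the real constructor function is therefore trusting client-controlled data.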
[10:34.940 --> 10:37.400]  Also, we should be aware of the differences
[10:37.400 --> 10:42.220]  between this attack vector and prototype pollution.
[10:42.240 --> 10:45.900]  The two attacks are totally different.
[10:45.900 --> 10:48.580]  Prototype pollution, as the name suggests,
[10:48.580 --> 10:52.540]  is about tampering with the prototype object.
[10:52.540 --> 10:58.760]  However, our attack vector does not modify the prototype object.
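To make the contrast concrete, here is a small sketch. The `polluted` property name is made up, and the direct write to `Object.prototype` only imitates the end effect of a prototype pollution bug; it is undone at the end.

```javascript
// HPA-style payload: only the attacker-supplied object is affected.
const crafted = JSON.parse('{"constructor":"attacker string"}');
const unrelated = {};
const unrelatedStillInherits = unrelated.constructor === Object;
const prototypeUntouched = Object.prototype.constructor === Object;

// Prototype pollution, by contrast, changes the shared prototype,
// so every object in the process observes the new property.
Object.prototype.polluted = true; // imitation of the end effect only
const unrelatedSeesPollution = ({}).polluted === true;
delete Object.prototype.polluted; // clean up

console.log(unrelatedStillInherits, prototypeUntouched, unrelatedSeesPollution);
```
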
[11:00.650 --> 11:03.690]  The root cause of HPA is that
[11:03.690 --> 11:07.990]  Node.js fails to isolate unsafe objects,
[11:07.990 --> 11:12.610]  such as user input, from critical internal states.
[11:13.030 --> 11:16.010]  So, to make a clear classification,
[11:16.010 --> 11:19.030]  HPA can be seen as a new security risk
[11:19.030 --> 11:22.710]  under the Common Weakness Enumeration entry CWE-915,
[11:22.710 --> 11:26.730]  whose child variants are all about improper modification
[11:26.730 --> 11:29.770]  of the dynamic object attributes.
[11:30.290 --> 11:32.910]  As shown in the hierarchy tree on the right,
[11:32.910 --> 11:36.390]  there are some similar issues on other language platforms,
[11:36.390 --> 11:40.730]  such as Ruby mass assignment and PHP object injection.
[11:40.870 --> 11:44.330]  Although these variants share the same behavior feature
[11:44.330 --> 11:46.150]  described by CWE-915,
[11:46.510 --> 11:49.850]  they all have their own data patterns
[11:49.850 --> 11:51.930]  due to the language differences.
[11:52.170 --> 11:55.010]  For example, Ruby mass assignment
[11:55.010 --> 11:57.330]  is a set of vulnerabilities discovered
[11:57.330 --> 12:00.790]  in a widely used Ruby web application framework
[12:00.790 --> 12:02.810]  called Ruby on Rails.
[12:03.170 --> 12:08.190]  Unlike HPA, the attacker does not pass objects into the program.
[12:08.190 --> 12:10.830]  Instead, the attacker abuses
[12:11.970 --> 12:15.050]  a framework-specific assignment feature in Ruby on Rails
[12:15.050 --> 12:17.530]  to modify certain existing objects
[12:17.530 --> 12:20.110]  on the right side of the assignment.
[12:20.170 --> 12:24.150]  And the payloads between the two attacks are also different.
[12:24.190 --> 12:27.430]  The mass assignment payloads are literal values.
[12:27.430 --> 12:30.150]  However, HPA can introduce hidden properties
[12:30.150 --> 12:34.810]  with either literal values or nested objects.
[12:34.810 --> 12:38.290]  More importantly, Ruby is strongly typed,
[12:38.290 --> 12:39.950]  so mass assignment vulnerabilities
[12:39.950 --> 12:43.210]  cannot introduce new properties to the objects.
[12:43.210 --> 12:48.490]  However, HPA can inject arbitrary properties,
[12:48.490 --> 12:52.210]  which makes HPA really flexible and powerful.
[12:53.710 --> 12:57.540]  OK, so with several pages of concept introduction,
[12:57.540 --> 13:00.940]  I think it's time we can hack some real targets.
[13:01.440 --> 13:06.340]  In this example, we target a popular web framework
[13:06.340 --> 13:08.340]  named routing-controllers.
[13:08.360 --> 13:10.840]  We will attack its official example code
[13:10.840 --> 13:15.960]  to demonstrate an end-to-end prototype inheritance hijack exploit,
[13:15.960 --> 13:19.500]  from security check bypass to SQL injection.
[13:20.340 --> 13:23.060]  The figure on the right can give you a brief idea
[13:23.060 --> 13:24.820]  on how our example works.
[13:24.820 --> 13:28.060]  In the example, a server program is deployed
[13:28.060 --> 13:30.060]  using routing-controllers.
[13:30.680 --> 13:34.200]  If a remote user wants to authenticate with the server,
[13:34.200 --> 13:37.300]  his data will flow into the following components.
[13:37.420 --> 13:39.960]  First, he will send his serialized data
[13:39.960 --> 13:43.140]  into the authentication module.
[13:43.220 --> 13:47.800]  Then, the authentication module will instantiate an object
[13:47.800 --> 13:49.460]  according to the JSON he provides
[13:49.460 --> 13:52.280]  and send it to the param handler.
[13:52.280 --> 13:57.080]  Here, we use the green box to demonstrate the user data object.
[13:57.080 --> 14:00.340]  The param handler is responsible for ensuring
[14:00.340 --> 14:03.320]  that the user input object is not malicious.
[14:03.620 --> 14:08.180]  The handler will first collect the internal format specification object,
[14:08.180 --> 14:10.460]  which is the blue box in the picture,
[14:10.460 --> 14:14.660]  and it will merge the specification with the input object
[14:15.140 --> 14:18.320]  and invoke the input validation API.
[14:18.320 --> 14:22.820]  The input validation API will sanitize the user input data
[14:22.820 --> 14:25.220]  according to the format specification.
[14:25.300 --> 14:30.020]  In this case, it will check if the email field is legitimate or not.
[14:30.020 --> 14:35.320]  If the check passes, the user object will flow into the database.
[14:35.600 --> 14:38.100]  Okay, so this is the overall data flow.
[14:38.100 --> 14:42.140]  Let's analyze how we can attack the logic step by step.
[14:43.000 --> 14:47.080]  So, the first step is the hidden property injection,
[14:47.080 --> 14:54.420]  in which the malicious attacker injects hidden properties in his request,
[14:54.420 --> 14:57.320]  which is the constructor in this case.
[14:57.320 --> 15:02.720]  As shown in the picture, when the server program instantiates the user object,
[15:02.720 --> 15:07.500]  there will be an additional property named the constructor,
[15:07.500 --> 15:13.580]  which is a payload for bypassing the input validation module.
[15:13.580 --> 15:20.820]  So, in the second step, the program will prepare the parameters needed by the input validation API.
[15:20.820 --> 15:29.280]  The server program will merge the user input, which is the param, with an object named the schema.
[15:29.280 --> 15:36.900]  The merging operation is carried out by putting every property from the param objects into the schema object.
[15:36.900 --> 15:41.280]  So, this process is very much like object assignment.
[15:41.280 --> 15:47.280]  To simplify the demonstration, let's just use object assignment in this example.
[15:47.360 --> 15:55.060]  By performing such merging operations, the hidden property constructor will also be transferred to the schema.
[15:55.540 --> 16:02.440]  And after the transfer, we now can hijack the inheritance of the constructor on the schema.
[16:02.440 --> 16:09.460]  Actually, the constructor of the schema plays a very important role in the input validation module.
[16:09.460 --> 16:19.120]  As shown in the picture, the constructor actually stores the important format restriction information.
[16:19.120 --> 16:27.900]  As a result, the merging operation enables us to hijack the inheritance of these important format restrictions.
[16:27.900 --> 16:33.500]  As shown in the picture, when the constructor is read by the getSchema function,
[16:33.500 --> 16:38.860]  our hidden property will be immediately matched and returned to the code.
[16:38.860 --> 16:45.660]  To bypass the input validation, we just need to set the format specification as an invalid value,
[16:45.660 --> 16:49.700]  so that our SQL injection payload can escape the check.
[16:50.020 --> 16:53.740]  Now, the last step is much more straightforward.
[16:53.740 --> 16:59.600]  The validated payload then flows into the sensitive API to finish the entire attack.
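The three steps above can be condensed into a toy model. Everything here is an assumption made for illustration: `UserSchema`, `getRules`, `validate` and `handle` are hypothetical names, the email rule is a simplified regex, and `Object.assign` plays the role of the merge, as the talk itself suggests. It is not the code of routing-controllers or its validator.

```javascript
// Hypothetical format specification: the rules hang off the constructor.
class UserSchema {}
UserSchema.rules = { email: /^[^@\s]+@[^@\s]+$/ };

// Reads the rules through the (normally inherited) `constructor` property.
function getRules(target) {
  return target.constructor.rules;
}

// Toy validator: with no rules found, there is nothing to check.
function validate(target) {
  const rules = getRules(target);
  if (!rules) return true; // the flaw this sketch models
  return Object.entries(rules).every(([key, re]) => re.test(String(target[key])));
}

// Step 2: merge the user input (param) into the schema object.
function handle(param) {
  const schema = Object.assign(new UserSchema(), param);
  return validate(schema);
}

const sqli = "' OR 1=1 --";
const blocked = handle({ email: sqli });                   // rules found, check fails
const bypassed = handle({ email: sqli, constructor: 1 });  // rules hidden, check skipped

console.log(blocked, bypassed);
```

The hidden `constructor` property becomes an own property of the merged schema, so the validator never reaches the real rules.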
[16:59.720 --> 17:03.960]  So, this is what an entire HPA chain looks like.
[17:03.960 --> 17:09.040]  Actually, the code logic behind the vulnerability is much more complex.
[17:09.040 --> 17:17.320]  For example, the input validation module contains 30,000 lines of code.
[17:17.320 --> 17:23.100]  So, it would be helpful if we had a tool that automatically tracks these data structures
[17:23.100 --> 17:27.980]  and discovers the hidden properties that exist in the program logic.
[17:28.700 --> 17:31.920]  So, what are the challenges of building such a tool?
[17:31.920 --> 17:34.060]  First of all, it is JavaScript.
[17:34.060 --> 17:38.180]  Analyzing JavaScript is known to be hard due to its dynamic nature.
[17:39.160 --> 17:46.320]  And second, HPA is an attack that creates unexpected new data dependencies.
[17:46.360 --> 17:56.080]  However, program analysis, such as data flow tracking, is mainly for analyzing the existing flows.
[17:56.080 --> 18:02.340]  Third, from our running example, we can observe that HPA tampers with internal program states.
[18:02.760 --> 18:08.260]  So, the attack effects highly depend on the roles of the compromised states.
[18:08.260 --> 18:11.020]  This makes the detection more challenging.
[18:11.440 --> 18:18.660]  To overcome these challenges, we designed and implemented Lynx, a hybrid JavaScript program analysis tool,
[18:18.660 --> 18:22.420]  to detect and exploit HPA vulnerabilities.
[18:22.700 --> 18:25.300]  Lynx mainly consists of two components.
[18:26.160 --> 18:31.980]  The first component on the left is for identifying potential hidden properties.
[18:31.980 --> 18:43.440]  It combines dynamic data flow tracking and static syntax analysis to track all the user input and infer potential candidates.
[18:43.440 --> 18:50.400]  And the component on the right is for detecting the harmful hidden properties and generating exploits for them.
[18:50.400 --> 19:01.360]  To help future Node.js security research, we decided to open-source our Lynx project at the GitHub link at the bottom.
[19:02.100 --> 19:08.920]  So, if you are interested in the technical details, you can check them in the GitHub repo.
[19:09.760 --> 19:14.660]  So, the very first thing Lynx will do is dynamic data flow tracking.
[19:14.660 --> 19:22.520]  First of all, Lynx will generate a label object, which is a unique key-value pair.
[19:22.980 --> 19:27.880]  Lynx will inject the label into the input data of the program.
[19:28.300 --> 19:38.040]  Since different properties from the input object may flow into different program logics, we want to track all these propagations.
[19:38.040 --> 19:43.820]  So, we perform the label injection in a recursive manner.
[19:43.900 --> 19:53.640]  That is, as shown in the picture on the left, Lynx will generate three different inputs by injecting the label into the original test case.
[19:53.640 --> 19:57.980]  Each time, Lynx will inject the label into a different property.
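A rough sketch of recursive label injection follows. The label key and value, the function name `injectLabels`, and the exact placement strategy (one variant per object level of the input) are assumptions for illustration; the talk does not spell out how Lynx does this internally.

```javascript
// Hypothetical unique label (key-value pair) used to tag inputs.
const LABEL_KEY = '__label__';
const LABEL_VALUE = 'label-0001';

// Produce one labeled variant of `input` per object found inside it.
function injectLabels(input) {
  const variants = [];
  const walk = (node, path) => {
    // Deep-copy the original test case, then label the object at `path`.
    const copy = JSON.parse(JSON.stringify(input));
    let target = copy;
    for (const key of path) target = target[key];
    target[LABEL_KEY] = LABEL_VALUE;
    variants.push(copy);
    // Recurse into nested objects so every level gets its own variant.
    for (const [key, value] of Object.entries(node)) {
      if (value !== null && typeof value === 'object') walk(value, [...path, key]);
    }
  };
  walk(input, []);
  return variants;
}

const testCase = { name: 'alice', profile: { age: 30, address: { city: 'Atlanta' } } };
const variants = injectLabels(testCase);
console.log(JSON.stringify(variants, null, 2));
```

Each variant is then fed to the program separately, so a label seen at runtime points back to exactly one position in the input.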
[19:58.740 --> 20:03.280]  After the injection, Lynx will observe the program execution.
[20:03.280 --> 20:10.380]  We leverage a JavaScript analysis framework called Jalangi to instrument our test programs.
[20:10.380 --> 20:18.500]  Since we are studying the data flow, we instrument the variable reads and writes, object property indexing and function calls.
[20:18.500 --> 20:21.060]  Then, we execute the test program.
[20:21.060 --> 20:26.040]  During the execution, Lynx will examine every object within the data flow.
[20:26.040 --> 20:32.520]  If an object carries our property label, we will record it for further analysis.
[20:33.440 --> 20:37.380]  So now we have a list of property carriers.
[20:37.580 --> 20:46.080]  Recall that an object is flagged as a property carrier because we detected our injected label in its body.
[20:46.080 --> 20:56.420]  So if we can propagate our label here, maybe we can propagate another, malicious property here as well.
[20:56.420 --> 21:05.280]  More specifically, if we can inject a property that has a conflicting name with certain internal properties the program has,
[21:05.280 --> 21:09.360]  maybe we can control that property by overwriting it.
[21:10.840 --> 21:22.340]  So now we want to extract all the child properties of the property carriers from the original program and flag them as hidden property candidates.
[21:22.340 --> 21:32.040]  To achieve this goal, we need static syntax analysis to extract the necessary syntactic information from the code.
[21:32.040 --> 21:37.460]  The picture on the right demonstrates how we parse a statement from our running example.
[21:37.460 --> 21:45.500]  Lynx will traverse the syntax tree until reaching a property carrier, which is circled by the red line in the graph.
[21:45.500 --> 21:53.800]  Then it records all the properties under the carrier. In our case, the hidden property candidate is the constructor.
[21:55.780 --> 22:01.140]  So here is an output screenshot of the first component.
[22:01.140 --> 22:10.800]  As you can observe, Lynx first instruments the code base, and then tracks 43 property carriers.
[22:10.800 --> 22:17.940]  As indicated by the red circle, Lynx successfully detects a hidden property named constructor.
[22:19.520 --> 22:25.240]  So in the previous component, we discovered the key name of the potential hidden properties.
[22:25.240 --> 22:30.440]  By injecting the property with the same key, we might overwrite certain internal states.
[22:30.440 --> 22:36.240]  However, we still don't know whether the candidates can be controlled or not.
[22:36.240 --> 22:41.200]  And we also don't know how to introduce attack effects with these candidates.
[22:41.220 --> 22:44.340]  So apparently, Lynx needs to do more.
[22:46.060 --> 22:54.260]  Let's revisit our running example to see if there are any insights to help us design such an exploitation component.
[22:54.340 --> 22:58.980]  The figure on the left is the vulnerable code from our running example.
[22:58.980 --> 23:04.680]  As we have discussed many times, hidden properties tamper with internal program states,
[23:04.680 --> 23:09.960]  which means HPA exploitation is highly related to the code context.
[23:10.060 --> 23:15.400]  So it is important to summarize a set of sensitive behaviors.
[23:15.400 --> 23:19.800]  These behaviors should clearly indicate certain security consequences,
[23:19.800 --> 23:24.880]  so that we can decouple the harmfulness detection from the code context.
[23:24.880 --> 23:31.760]  Also, from the running example, the exploitation is mainly about manipulating the return result.
[23:32.540 --> 23:36.660]  More specifically, there are two possible paths here.
[23:36.660 --> 23:43.400]  If the execution enters the branch on line 19, we will get a validation failed.
[23:43.400 --> 23:49.100]  But if we can reach line 21, we can successfully pass the check.
[23:49.100 --> 23:55.860]  So the exploitation point and the override point may not be the same place,
[23:55.860 --> 24:00.100]  which means we shouldn't stop our analysis at line 11.
[24:00.100 --> 24:07.440]  Instead, we should continue exploring all the possible paths that can be triggered by manipulating the hidden properties.
[24:08.420 --> 24:13.580]  So we studied and summarized six general types of sensitive sinks.
[24:14.560 --> 24:19.960]  Due to the time constraint, I will not introduce the details of each type.
[24:19.960 --> 24:23.600]  If you want to know more, you can check our GitHub repo.
[24:24.960 --> 24:28.160]  So after defining our sensitive sinks,
[24:28.160 --> 24:33.300]  we want the hidden property to trigger as many branches as possible,
[24:33.300 --> 24:36.420]  and monitor whether we can hit a sink.
[24:36.420 --> 24:42.480]  To achieve this goal, we use symbolic execution to explore the entire hidden property value space.
[24:42.480 --> 24:49.920]  Lynx first generates an exploit template that can reach the potentially vulnerable hidden property.
[24:50.080 --> 24:53.680]  We denote such data structure as an exploit template,
[24:54.020 --> 25:01.180]  because Lynx does not specify a concrete value in the input.
[25:01.180 --> 25:08.840]  Instead, we insert a special placeholder, which will be used by the symbolic execution later on.
[25:08.840 --> 25:13.880]  Then, we run the test program with our constructed templates,
[25:13.880 --> 25:18.020]  and symbolically execute the hidden property.
[25:18.020 --> 25:25.960]  As shown in the picture, Lynx will explore all the path constraints along the input path,
[25:25.960 --> 25:31.680]  and if Lynx finds that it has hit a sink,
[25:31.680 --> 25:35.800]  for example, in this case it hit the sink i2,
[25:35.800 --> 25:43.860]  then it will fetch the corresponding payload that can trigger the sink as the final exploit.
[25:44.240 --> 25:55.540]  A little background: i2 is the sink we define to detect whether our input can manipulate the return value of a module.
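The template-and-placeholder idea can be approximated without a symbolic execution engine. The sketch below swaps in plain enumeration over a small, hand-picked candidate list, which is much weaker than the symbolic execution described in the talk. The module under test, the `'<HOLE>'` marker, and the "return value differs from the baseline" check are all assumptions standing in for the real target, placeholder and sink.

```javascript
// Hypothetical module under test: its return value is the observed sink.
class Schema {}
Schema.rules = { email: /@/ };

function moduleUnderTest(input) {
  const schema = Object.assign(new Schema(), input);
  const rules = schema.constructor.rules;
  if (rules && !rules.email.test(String(schema.email))) return 'validation failed';
  return 'ok';
}

// Exploit template: reaches the hidden property but leaves its value open.
const HOLE = '<HOLE>';
const template = { email: 'not-an-email', constructor: HOLE };

// Replace the placeholder with a concrete candidate value.
function fill(tpl, value) {
  return Object.fromEntries(
    Object.entries(tpl).map(([key, v]) => [key, v === HOLE ? value : v])
  );
}

// Enumeration in place of symbolic execution: try a few concrete values.
const candidates = [0, 1, '', 'x', true, {}, [], Schema];
const baseline = moduleUnderTest({ email: 'not-an-email' });
const exploits = candidates
  .map((value) => fill(template, value))
  .filter((input) => moduleUnderTest(input) !== baseline);

console.log(baseline, exploits.length);
```

A symbolic executor would instead collect the path constraints on the placeholder and solve for a value that reaches the sink, rather than guessing from a list.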
[25:57.990 --> 26:01.610]  So this is the output of the exploit module.
[26:01.610 --> 26:09.790]  From the circled area, we can observe that the key-value pair constructor: 1 triggered the sink i2.
[26:09.870 --> 26:19.270]  In the last line, we can find that Lynx successfully generated an exploit that leads to a successful validation.
[26:19.710 --> 26:23.010]  So this is pretty much about how our system works.
[26:23.010 --> 26:26.850]  Let's see some interesting new results of our research.
[26:26.850 --> 26:33.150]  During our research, we chose 60 widely used programs from npm.
[26:33.150 --> 26:36.690]  There are 55 modules and 5 web applications.
[26:36.990 --> 26:42.910]  And with the help of Lynx, we tracked more than 1,300 carriers,
[26:42.910 --> 26:49.330]  and detected more than 300 hidden property candidates associated with those carriers.
[26:49.330 --> 26:53.130]  In the end, we confirmed 13 zero-day vulnerabilities.
[26:53.130 --> 26:59.510]  With the help of symbolic execution, Lynx even synthesized 10 exploits automatically.
[27:00.310 --> 27:03.630]  So what is the impact of these vulnerabilities?
[27:03.670 --> 27:07.550]  We found that HPA can introduce various attack effects,
[27:07.550 --> 27:15.030]  such as leaking credential data, denial of service, or bypassing security checks.
[27:15.030 --> 27:22.530]  Based on the impact analysis, we can observe that HPA can compromise previously unreachable program states,
[27:22.530 --> 27:26.670]  which effectively enlarges the attack surface.
[27:26.850 --> 27:32.110]  Even more, we could notice that HPA is not a simple input validation issue,
[27:32.110 --> 27:36.730]  and many input validators themselves are also vulnerable to HPA.
[27:37.170 --> 27:45.230]  So in the following slides, I will pick two interesting vulnerabilities from our results and walk through them as case studies.
[27:45.910 --> 27:49.970]  So the first case comes from the official MongoDB driver.
[27:49.970 --> 27:54.930]  We found that we can tamper with an internal state named the BSON type.
[27:55.090 --> 28:05.270]  The background here is that MongoDB leverages the internal state BSON type to indicate the data type of the query object.
[28:05.270 --> 28:13.190]  However, when serializing the query object, MongoDB will ignore objects with an unknown BSON type.
[28:13.330 --> 28:18.270]  So what if we abuse the logic for a query condition object?
[28:18.270 --> 28:22.530]  The code on the right is from an open source online game.
[28:22.530 --> 28:27.610]  The online game uses a vulnerable API to implement the user management logic.
[28:27.650 --> 28:32.610]  As shown in the picture, by injecting an unknown BSON type into the input,
[28:32.610 --> 28:36.990]  the attacker can force MongoDB not to serialize the query condition object,
[28:36.990 --> 28:42.730]  so that MongoDB will always return the first user at the top of the database.
[28:42.730 --> 28:51.190]  With this ability, the attacker can log in to or delete arbitrary accounts.
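The effect can be modeled with a toy in-memory lookup. This is not the MongoDB driver's code: `serializeFilter`, `findOne`, the list of known types, and the property spelling `_bsontype` (assumed here for the state the talk calls "BSON type") are all illustrative assumptions.

```javascript
// Hypothetical list of type tags the toy serializer recognizes.
const KNOWN_TYPES = new Set(['ObjectID', 'Long', 'Binary']);

// Toy serializer: a filter tagged with an unknown type is dropped entirely.
function serializeFilter(filter) {
  if ('_bsontype' in filter && !KNOWN_TYPES.has(filter._bsontype)) return {};
  return { ...filter };
}

// Toy lookup: returns the first document matching every serialized condition.
function findOne(collection, filter) {
  const wire = serializeFilter(filter);
  const hit = collection.find((doc) =>
    Object.entries(wire).every(([key, value]) => doc[key] === value)
  );
  return hit || null;
}

const users = [
  { username: 'admin', password: 's3cret' },
  { username: 'bob', password: 'hunter2' },
];

// Wrong credentials are rejected as expected.
const rejected = findOne(users, { username: 'bob', password: 'wrong' });

// With the hidden property, the conditions vanish and the first user matches.
const attackInput = JSON.parse('{"username":"x","password":"y","_bsontype":"a"}');
const leaked = findOne(users, attackInput);

console.log(rejected, leaked);
```
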
[28:52.590 --> 28:57.250]  The second case is from another widely used in-memory database.
[28:57.250 --> 29:02.990]  The hidden property is more like a backdoor which helps the user access the sensitive data.
[29:02.990 --> 29:14.010]  In TaffyDB, we discovered a hidden property named id, which is an internal index for each database data item.
[29:14.010 --> 29:24.370]  Once we specify our own id in the query, TaffyDB will ignore other query conditions and directly return the result associated with the index.
[29:24.370 --> 29:29.510]  As shown in the picture, even though we supplied a wrong password and username,
[29:29.510 --> 29:37.670]  we can still leak the valid user data from the database with our crafted hidden properties.
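A toy model of this index shortcut is shown below. It is not TaffyDB's implementation: the `query` function, the record layout, and the use of a plain `id` key (following the name used in the talk) are assumptions made for illustration.

```javascript
// Toy in-memory store; `id` plays the role of the internal index.
const records = [
  { id: 'T000001', username: 'admin', password: 's3cret' },
  { id: 'T000002', username: 'bob', password: 'hunter2' },
];

// Toy query: if the conditions carry an internal index, it short-circuits
// the lookup and every other condition is ignored.
function query(store, conditions) {
  if ('id' in conditions) {
    return store.filter((record) => record.id === conditions.id);
  }
  return store.filter((record) =>
    Object.entries(conditions).every(([key, value]) => record[key] === value)
  );
}

// Wrong username and password: no result.
const honest = query(records, { username: 'nobody', password: 'wrong' });

// Same wrong credentials plus the hidden index: the record is returned anyway.
const crafted = JSON.parse('{"username":"nobody","password":"wrong","id":"T000001"}');
const leaked = query(records, crafted);

console.log(honest.length, leaked.length);
```
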
[29:38.450 --> 29:43.430]  Thanks for attending our talk. I hope you guys keep safe in this special year.
